Dynamic evidence models in a DBN phone recognizer

Authors

  • William Schuler
  • Timothy A. Miller
  • Stephen T. Wu
  • Andrew Exley
Abstract

This paper describes an implementation of a discriminative acoustical model, a Conditional Random Field (CRF), within a Dynamic Bayes Net (DBN) formulation of a Hierarchic Hidden Markov Model (HHMM) phone recognizer. This CRF-DBN topology accounts for phone transition dynamics in conditional probability distributions over random variables associated with observed evidence. It therefore has less need for hidden variable states corresponding to transitions between phones, which leaves more hypothesis space available for modeling higher-level linguistic phenomena such as syntax and semantics. The model also explicitly represents likely formant trajectories and formant targets of modeled phones in its random variable distributions, making it more linguistically transparent than models based on traditional HMMs with conditionally independent evidence variables. Results on the standard TIMIT phone recognition task show that this CRF evidence model, even with a relatively simple first-order feature set, is competitive with standard HMMs and DBN variants using static Gaussian mixture models on MFCC features.

Similar resources

A Tandem BLSTM-DBN Architecture for Keyword Spotting with Enhanced Context Modeling

We propose a novel architecture for keyword spotting which is composed of a Dynamic Bayesian Network (DBN) and a bidirectional Long Short-Term Memory (BLSTM) recurrent neural net. The DBN is based on a phoneme recognizer and uses a hidden garbage variable as well as the concept of switching parents to discriminate between keywords and arbitrary speech. Contextual information is incorporated by ...

Improving Keyword Spotting with a Tandem BLSTM-DBN Architecture

We propose a novel architecture for keyword spotting which is composed of a Dynamic Bayesian Network (DBN) and a bidirectional Long Short-Term Memory (BLSTM) recurrent neural net. The DBN uses a hidden garbage variable as well as the concept of switching parents to discriminate between keywords and arbitrary speech. Contextual information is incorporated by a BLSTM network, providing a discrete...

Using a DBN to integrate sparse classification and GMM-based ASR

The performance of an HMM-based speech recognizer using MFCCs as input is known to degrade dramatically in noisy conditions. Recently, an exemplar-based noise robust ASR approach, sparse classification (SC) was introduced. While very successful at lower SNRs, the performance at high SNRs suffered when compared to HMM-based systems. In this work, we propose to use a Dynamic Bayesian Network (DBN...

Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition

Recently, deep learning techniques have been successfully applied to automatic speech recognition tasks, first to phonetic recognition with context-independent deep belief network (DBN) hidden Markov models (HMMs) and later to large vocabulary continuous speech recognition using context-dependent (CD) DBN-HMMs. In this paper, we report our most recent experiments designed to understand the role...

Sensor Validation Using Dynamic Belief Networks

The trajectory of a robot is monitored in a restricted dynamic environment using light beam sensor data. We have a Dynamic Belief Network (DBN), based on a discrete model of the domain, which provides discrete monitoring analogous to conventional quantitative filter techniques. Sensor observations are added to the basic DBN in the form of specific evidence. However, sensor data is often pa...

Publication date: 2006